Bagging Evolutionary ROC-based Hypotheses Application to Terminology Extraction

Authors

  • Jérôme Azé
  • Mathieu Roche
  • Michèle Sebag
Abstract

The paper claims that Evolutionary Learning is a source of diverse hypotheses “for free”, and that this property can be exploited by combining the hypotheses learned in independent runs into an ensemble. Our algorithm, Broger (Bagging-ROC GEnetic LEarneR), uses Evolutionary Learning to optimize the Area Under the ROC Curve. The paper first presents the theoretical framework of Broger and then its application to a Term Extraction task in Text Mining.
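
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear scoring hypotheses, the simple (1+lambda) mutation scheme, the score-averaging aggregation, and all function and parameter names (`auc`, `evolve`, `ensemble_scores`, `sigma`, and so on) are assumptions made for illustration only. Each independent run evolves one hypothesis that maximizes the Area Under the ROC Curve on labeled examples, and the ensemble combines the hypotheses from several runs.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve: fraction of (positive, negative) pairs
    ranked correctly, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def score(weights, x):
    """Linear scoring hypothesis (an assumption of this sketch)."""
    return sum(w * xi for w, xi in zip(weights, x))


def evolve(X, y, rng, generations=50, offspring=10, sigma=0.3):
    """One independent run: a simple elitist (1+lambda) scheme that
    mutates the weights with Gaussian noise and keeps the best AUC."""
    def fitness(w):
        return auc([score(w, x) for x in X], y)

    parent = [rng.gauss(0, 1) for _ in range(len(X[0]))]
    parent_fit = fitness(parent)
    for _ in range(generations):
        children = [[w + rng.gauss(0, sigma) for w in parent]
                    for _ in range(offspring)]
        best_child = max(children, key=fitness)
        child_fit = fitness(best_child)
        if child_fit >= parent_fit:  # the parent is never replaced by a worse child
            parent, parent_fit = best_child, child_fit
    return parent


def ensemble_scores(hypotheses, X):
    """Combine hypotheses by averaging their scores (one possible
    aggregation; other choices such as the median are equally plausible)."""
    return [sum(score(h, x) for h in hypotheses) / len(hypotheses) for x in X]


if __name__ == "__main__":
    # Toy data: two features per candidate term, label 1 = relevant.
    X = [[0.2, 1.0], [0.4, 0.8], [0.5, 0.1], [0.9, 0.3], [1.0, 0.2]]
    y = [0, 0, 1, 1, 1]
    # Independent runs differ only in their random seed.
    runs = [evolve(X, y, random.Random(seed)) for seed in range(5)]
    print("ensemble AUC:", auc(ensemble_scores(runs, X), y))
```

The independent runs differ only in their random seeds, which is where the "free" diversity in this sketch comes from; no resampling of the training set is involved.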

Similar resources

Learning Interestingness Measures in Terminology Extraction. A ROC-based approach

In the field of Text Mining, a key phase in data preparation is concerned with the extraction of terms, i.e. collocations of words attached to specific concepts (e.g. Philosophy-Dissertation). In this paper, Term Extraction is formalized as a supervised learning task, extracting a ranking hypothesis from a set of terms labeled as relevant/irrelevant by the expert. This task is tackled using the ...

Preference Learning in Terminology Extraction: A ROC-based approach

A key data preparation step in Text Mining, Term Extraction selects the terms, or collocations of words, attached to specific concepts. In this paper, the task of extracting relevant collocations is achieved through a supervised learning algorithm, exploiting a few collocations manually labelled as relevant/irrelevant. The candidate terms are described along 13 standard statistical criteria meas...

Learning to Order Terms: Supervised Interestingness Measures in Terminology Extraction

Term Extraction, a key data preparation step in Text Mining, extracts the terms, i.e. relevant collocations of words, attached to specific concepts (e.g. genetic-algorithms and decision-trees are terms associated with the concept “Machine Learning”). In this paper, the task of extracting interesting collocations is achieved through a supervised learning algorithm, exploiting a few collocations man...

Application of ensemble learning techniques to model the atmospheric concentration of SO2

For pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

On the evolutionary design of heterogeneous Bagging models

Bagging is a popular ensemble algorithm based on the idea of data resampling. In this paper, aiming to increase ensemble diversity, we present an evolutionary approach for optimally designing Bagging models composed of heterogeneous components. To assess its potential, experiments with well-known learning algorithms and classification datasets are discussed whereby the...

Publication date: 2005